Cographs are graphs in which no four vertices induce a simple connected path $P_4$. Cograph editing is the
problem of finding, for a given graph $G = (V,E)$, a set of at most $k$ edge additions and deletions that
transform $G$ into a cograph. This combinatorial optimization problem is NP-hard. It has recently found
applications in the context of phylogenetics, hence good heuristics are of practical importance.

It is well known that the cograph editing problem can be solved independently on the so-called strong prime modules
of the modular decomposition of $G$. We show here that editing the induced $P_4$’s of a given graph is equivalent to
resolving strong prime modules by means of a newly defined merge operation $\uplus$ on the submodules. This observation
leads to a new exact algorithm for the cograph editing problem that can be used as a starting point for the construction
of novel heuristics.

1 Introduction

Cographs are among the best-studied graph classes. In particular, the fact that many problems that are
NP-complete for arbitrary graphs become polynomial-time solvable on cographs [CPS85, BLS99, GHN13]
makes them an attractive starting point for constructing heuristics. As noted already in [CLSB81], the
input for several combinatorial optimization problems, such as exam scheduling or several variants of
clustering problems, is naturally expected to have few induced $P_4$’s. Since graphs without an induced $P_4$
are exactly the cographs, identifying the closest cograph and solving the problem at hand for the modified
input becomes a viable strategy.

It was recently shown that orthology, a key concept in evolutionary biology and phylogenetics, is
intimately tied to cographs. Two genes in a pair of related species are said to be orthologous if their last
common ancestor was a speciation event. The orthology relation on a set of genes forms a cograph
[HHRH+13]. This relation can be estimated directly from biological sequence data, albeit in a necessarily
noisy form. Correcting such an initial estimate to the nearest cograph, i.e., cograph-editing, has thus
recently become a problem of considerable practical interest in computational biology
[HWL+15]. However, the (decision version of the) problem of editing a given graph into a cograph is
NP-complete [LWGC11, LWGC12]. We showed in [HWL+15] that the cograph-editing problem is amenable
to formulations as Integer Linear Programs (ILP). Computational experiments showed, however, that the
performance of the ILP does not scale favorably, thus limiting exact ILP solutions in practice to
moderate-sized data. Fast and accurate heuristics for cograph-editing are therefore of immediate practical
interest in the field of phylogenomics.

The cotree of a cograph coincides with the modular decomposition tree [Gal67], which is defined for
all graphs. We investigate here how edge editing on an approximate cograph is related to editing the
corresponding modular decomposition trees.


2 Basic Definitions 

We consider simple finite undirected graphs $G = (V,E)$ without loops. The notation $G + e$, $G - e$ and
$G \triangle e$ is used to denote the graphs $(V, E \cup \{e\})$, $(V, E \setminus \{e\})$ and
$(V, E \triangle \{e\})$, respectively. A graph $H = (W,E')$ is a subgraph of a graph $G = (V,E)$, in symbols
$H \subseteq G$, if $W \subseteq V$ and $E' \subseteq E$. If $H \subseteq G$ and $xy \in E'$ if and only if
$xy \in E$ for all $x,y \in W$, then $H$ is called an induced subgraph. We often denote
such an induced subgraph $H = (W,E')$ by $G[W]$. A connected component of $G$ is a connected induced
subgraph that is maximal w.r.t. inclusion. The complement $\overline{G}$ of a graph $G = (V,E)$ has vertex set $V$ and
edge set $E(\overline{G}) = \{xy \mid x,y \in V, x \neq y, xy \notin E\}$. The complete graph $K_{|V|} = (V,E)$ has
edge set $E = \binom{V}{2}$. We write $G \simeq H$ for two isomorphic graphs $G$ and $H$.

Let $G = (V,E)$ be a graph. The (open) neighborhood $N(v)$ is defined as $N(v) = \{x \mid vx \in E\}$. The
(closed) neighborhood $N[v]$ is then $N[v] = N(v) \cup \{v\}$. If there is a risk of confusion we will write $N_G(v)$,
resp., $N_G[v]$ to indicate that the respective neighborhoods are taken w.r.t. $G$. The degree $\deg(v)$ of a vertex
is defined as $\deg(v) = |N(v)|$.

A tree is a connected graph that does not contain cycles. A path is a tree where every vertex has degree
1 or 2. A rooted tree $T = (V,E)$ is a tree with one distinguished vertex $\rho \in V$. The first inner vertex
$\mathrm{lca}(x,y)$ that lies on both unique paths from two vertices $x$, resp., $y$ to the root, is called lowest common
ancestor of $x$ and $y$. It is well known that there is a one-to-one correspondence between (isomorphism
classes of) rooted trees on $V$ and so-called hierarchies on $V$. For a finite set $V$, a hierarchy on $V$ is a subset
$\mathcal{C}$ of the power set $\mathcal{P}(V)$ such that (i) $V \in \mathcal{C}$, (ii) $\{x\} \in \mathcal{C}$ for all
$x \in V$ and (iii) $p \cap q \in \{p, q, \emptyset\}$ for all $p,q \in \mathcal{C}$.


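Theorem 2.1 ([SS03]) Let $\mathcal{C}$ be a collection of non-empty subsets of $V$. Then, there is a rooted tree
$T = (W,E)$ on $V$ with $\mathcal{C} = \{L(v) \mid v \in W\}$, where $L(v)$ denotes the set of leaves that are
descendants of $v$, if and only if $\mathcal{C}$ is a hierarchy on $V$.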

3 Cographs, $P_4$-sparse Graphs and the Modular Decomposition

3.1 Introduction to Cographs 

Cographs are defined as the class of graphs formed from a single vertex under the closure of the operations
of union and complementation, namely: (i) a single-vertex graph $K_1$ is a cograph; (ii) the disjoint union
$G = (V_1 \cup V_2, E_1 \cup E_2)$ of cographs $G_1 = (V_1,E_1)$ and $G_2 = (V_2,E_2)$ is a cograph; (iii) the
complement $\overline{G}$ of a cograph $G$ is a cograph. The name cograph originates from complement reducible
graphs, as by definition, cographs can be “reduced” by stepwise complementation of connected components to totally
disconnected graphs [Sei74].

It is well known that for each induced subgraph $H$ of a cograph $G$ either $H$ is disconnected or its
complement $\overline{H}$ is disconnected [BLS99]. This, in particular, allows representing the structure of a cograph
$G = (V,E)$ in an unambiguous way as a rooted tree $T = (W,F)$, called cotree: If the considered cograph
is the single vertex graph $K_1$, then output the tree $(\{u\}, \emptyset)$. Else if the given cograph $G$ is connected,
create an inner vertex $u$ in the cotree with label “series”, build the complement $\overline{G}$ and add the connected
components of $\overline{G}$ as children of $u$. If $G$ is not connected, then create an inner vertex $u$ in the cotree with
label “parallel” and add the connected components of $G$ as children of $u$. Proceed recursively on the
respective connected components that consist of more than one vertex. Eventually, this cotree will have
leaf-set $V \subseteq W$ and inner vertices $u \in W \setminus V$ are labeled with either “parallel” or “series” s.t.
$xy \in E$ if and only if $u = \mathrm{lca}_T(x,y)$ is labeled “series”. Since a cograph and its cotree are uniquely
determined by each other, one can use the cotree representation to test in linear time whether two cographs are
isomorphic [CPS85].
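
The recursive construction just described can be sketched in a few lines of Python. This is only a minimal illustration and not the linear-time method of [CPS85]; the adjacency-dictionary representation and the names `components` and `cotree` are assumptions made for this example.

```python
def components(vertices, adj):
    """Connected components of the subgraph induced by `vertices`."""
    vertices, seen, comps = set(vertices), set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for w in adj[u] & vertices:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def cotree(vertices, adj):
    """Return a leaf vertex or a pair (label, children).

    Raises ValueError if the induced graph is not a cograph, i.e. if
    some induced subgraph and its complement are both connected.
    """
    vertices = set(vertices)
    if len(vertices) == 1:
        return next(iter(vertices))
    comps = components(vertices, adj)
    if len(comps) > 1:
        return ("parallel", [cotree(c, adj) for c in comps])
    # G[vertices] is connected: split along components of the complement.
    co_adj = {v: (vertices - adj[v]) - {v} for v in vertices}
    co_comps = components(vertices, co_adj)
    if len(co_comps) == 1:
        raise ValueError("graph and complement both connected: not a cograph")
    return ("series", [cotree(c, adj) for c in co_comps])
```

For the path 0 - 1 - 2 the root is labeled “series” (the complement splits into {0,2} and {1}), whereas a path on four vertices raises the error because it is isomorphic to its own complement.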

The complement of a path on four vertices $P_4$ is again a $P_4$ and hence, such graphs are not cographs.
Intriguingly, cographs have indeed a quite simple characterization as $P_4$-free graphs, that is, no four
vertices induce a $P_4$. A number of further equivalent characterizations are given in [BLS99] and Theorem 3.5.
Determining whether a graph is a cograph can be done in linear time [CPS85, BCHP08].
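
As an illustration of the $P_4$-free characterization, a brute-force test is sketched below. It inspects every four-element vertex subset and therefore needs $O(|V|^4)$ time, unlike the linear-time recognition algorithms cited above; the function names and the adjacency-dictionary input are assumptions made for this sketch.

```python
from itertools import combinations, permutations


def induces_p4(quad, adj):
    """True if the four vertices in `quad` induce a path a - b - c - d."""
    for a, b, c, d in permutations(quad):
        if (b in adj[a] and c in adj[b] and d in adj[c]
                and c not in adj[a] and d not in adj[a] and d not in adj[b]):
            return True
    return False


def is_cograph(adj):
    """Brute-force test: a graph is a cograph iff no 4 vertices induce a P4."""
    return not any(induces_p4(q, adj) for q in combinations(adj, 4))
```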

3.2 Modules and the Modular Decomposition 

The concept of modular decompositions (MD) is defined for arbitrary graphs $G$. It presents the structure of
$G$ in the form of a tree that generalizes the idea of cotrees. However, in general much more information
needs to be stored at the inner vertices of this tree if the original graph is to be recovered.

The MD is based on modules, which are also known as autonomous sets [MR84, Moh85], closed sets
[Gal67], clans [EGMS94], stable sets, clumps [Bla78] or externally related sets [HM79]. A module of
a given graph $G = (V,E)$ is a subset $M \subseteq V$ with the property that $N(x) \setminus M = N(y) \setminus M$
holds for all vertices $x,y \in M$. The vertices within a given module $M$ are therefore not distinguishable by
the part of their neighborhoods that lies “outside” $M$. Modules can thus be seen as a generalization of the
notion of connected components. We denote with $\mathbb{M}(G)$ the set of all modules of $G = (V,E)$. Clearly,
the vertex set $V$ and the singletons $\{v\}$, $v \in V$ are modules, called trivial modules. A graph $G$ is called
prime if it only contains trivial modules. For a module $M$ of $G$ and a vertex $v \in M$, we define the
out$_M$-neighborhood of $v$ in $G$ as the set $N_G^M(v) := N_G(v) \setminus M$. Since for each $v,w \in M$ the
out$_M$-neighborhoods are identical, $N_G^M(v) = N_G^M(w)$, we can equivalently define the out$_M$-neighborhood
of the module $M$ as $N_G^M := N_G^M(v)$, $v \in M$.

For a graph $G = (V,E)$ let $M$ and $M'$ be disjoint subsets of $V$. We say that $M$ and $M'$ are adjacent (in
$G$) if each vertex of $M$ is adjacent to all vertices of $M'$; the sets are non-adjacent if none of the vertices
of $M$ is adjacent to a vertex of $M'$. Two disjoint modules are either adjacent or non-adjacent [Moh85].
One can therefore define the quotient graph $G/\mathbb{P}$ for an arbitrary subset $\mathbb{P} \subseteq \mathbb{M}(G)$
of pairwise disjoint modules: $G/\mathbb{P}$ has $\mathbb{P}$ as its vertex set and $\{M_i,M_j\} \in E(G/\mathbb{P})$
if and only if $M_i$ and $M_j$ are adjacent in $G$.
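
The module condition and the quotient graph translate directly into code. The following is a minimal sketch under the assumption that graphs are given as adjacency dictionaries; the names `is_module` and `quotient` are hypothetical, and `quotient` assumes (without checking) that its input consists of pairwise disjoint modules.

```python
def is_module(M, adj):
    """All vertices of M must have the same neighbors outside M."""
    M = set(M)
    return len({frozenset(adj[v] - M) for v in M}) <= 1


def quotient(parts, adj):
    """Quotient graph whose vertices are the given pairwise disjoint modules."""
    parts = [frozenset(p) for p in parts]
    q = {p: set() for p in parts}
    for i, p in enumerate(parts):
        for r in parts[i + 1:]:
            # Disjoint modules are adjacent or non-adjacent, so a single
            # cross edge already implies that all cross edges are present.
            if any(y in adj[x] for x in p for y in r):
                q[p].add(r)
                q[r].add(p)
    return q
```

On the path 0 - 1 - 2, the set {0,2} is a module, and the quotient by {{0,2},{1}} is a single edge.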

A module $M$ is called strong if for any module $M'$ either $M \cap M' = \emptyset$, or $M \subseteq M'$, or
$M' \subseteq M$, i.e., a strong module does not overlap any other module. The set of all strong modules
$\mathrm{MD}(G) \subseteq \mathbb{M}(G)$ thus forms a hierarchy, the so-called modular decomposition of $G$. While
arbitrary modules of a graph form a potentially exponential-sized family, the sub-family of strong modules has
size $O(|V(G)|)$ [HDMP04].

Let $\mathbb{P} = \{M_1,\dots,M_k\}$ be a partition of the vertex set of a graph $G = (V,E)$. If every
$M_i \in \mathbb{P}$ is a module of $G$, then $\mathbb{P}$ is a modular partition of $G$. A non-trivial modular
partition $\mathbb{P} = \{M_1,\dots,M_k\}$ that contains only maximal (w.r.t. inclusion) strong modules is a maximal
modular partition. We denote the (unique) maximal modular partition of $G$ by $\mathbb{P}_{\max}(G)$. We will refer
to the elements of $\mathbb{P}_{\max}(G[M])$ as the children of $M$. This terminology is motivated by the following
considerations:

The hierarchical structure of $\mathrm{MD}(G)$ gives rise to a canonical tree representation of $G$, which is usually
called the modular decomposition tree $\mathrm{MDT}(G)$ [MR84, HP10]. The root of this tree is the trivial module
$V$ and its $|V|$ leaves are the trivial modules $\{v\}$, $v \in V$. The set of leaves $L_v$ associated with the subtree
rooted at an inner vertex $v$ induces a strong module of $G$. Moreover, inner vertices $v$ are labeled “parallel”
if the induced subgraph $G[L_v]$ is disconnected, “series” if the complement $\overline{G[L_v]}$ is disconnected and
“prime” otherwise, i.e., if $G[L_v]$ and $\overline{G[L_v]}$ are both connected. The module $L_v$ of the induced subgraph
$G[L_v]$ associated to a vertex $v$ labeled “prime” is called prime module. Note that the latter does not imply
that $G[L_v]$ is prime; however, in all cases $G[L_v]/\mathbb{P}_{\max}(G[L_v])$ is prime [HP10]. Similar to cotrees
it holds that $xy \in E$ if $u = \mathrm{lca}_{\mathrm{MDT}(G)}(x,y)$ is labeled “series”, and $xy \notin E$ if
$u = \mathrm{lca}_{\mathrm{MDT}(G)}(x,y)$ is labeled “parallel”.
However, to trace back the full structure of a given graph $G$ from $\mathrm{MDT}(G)$ one has to store additionally the
information of the subgraph $G[L_v]/\mathbb{P}_{\max}(G[L_v])$ in the vertices $v$ labeled “prime”. Although
$\mathrm{MD}(G) \subseteq \mathbb{M}(G)$ does not represent all modules, we state the following remarkable fact
[Moh85, DGM97]: Any subset $M \subseteq V$ is a module if and only if $M \in \mathrm{MD}(G)$ or $M$ is the union of
children of non-prime modules. Thus, $\mathrm{MDT}(G)$ represents at least implicitly all modules of $G$.

A simple polynomial time recursive algorithm to compute $\mathrm{MDT}(G)$ is as follows [HP10]: (1) compute
the maximal modular partition $\mathbb{P}_{\max}(G)$; (2) label the root node according to the parallel, series or
prime type of $G$; (3) for each strong module $M$ of $\mathbb{P}_{\max}(G)$, compute $\mathrm{MDT}(G[M])$ and attach
it to the root node. The first polynomial time algorithm to compute the modular decomposition is due to Cowan et al.
[CJS72], and it runs in $O(|V|^4)$. Improvements are due to Habib and Maurer [HM79], who proposed a cubic time
algorithm, and to Muller and Spinrad [MS89], who designed a quadratic time algorithm. The first two
linear time algorithms appeared independently in 1994 [CH94, MS94]. Since then a series of simplified
algorithms has been published, some running in linear time [DGM01, MS99, TCHP08], and others in
almost linear time [DGM01, MS00, HPV99, HDMP04].
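
For very small graphs the strong modules and their labels can also be obtained by exhaustive enumeration. The sketch below is none of the algorithms cited above: it enumerates all vertex subsets and is therefore exponential in $|V|$. It is only meant to make the definitions concrete; the function names and the adjacency-dictionary input are assumptions.

```python
from itertools import combinations


def modules(adj):
    """All modules of the graph, found by testing every non-empty subset."""
    V = list(adj)
    found = []
    for r in range(1, len(V) + 1):
        for S in combinations(V, r):
            M = set(S)
            if len({frozenset(adj[v] - M) for v in M}) == 1:
                found.append(frozenset(M))
    return found


def strong_modules(adj):
    """Modules that do not overlap any other module."""
    mods = modules(adj)

    def overlap(A, B):
        return bool(A & B) and not A <= B and not B <= A

    return [M for M in mods if not any(overlap(M, N) for N in mods)]


def is_connected(S, adj):
    S = set(S)
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in (adj[u] & S) - seen:
            seen.add(w)
            stack.append(w)
    return seen == S


def md_labels(adj):
    """Label every strong module with more than one vertex."""
    labels = {}
    for M in strong_modules(adj):
        if len(M) == 1:
            continue
        co = {v: (M - adj[v]) - {v} for v in M}  # complement of G[M]
        if not is_connected(M, adj):
            labels[M] = "parallel"
        elif not is_connected(M, co):
            labels[M] = "series"
        else:
            labels[M] = "prime"
    return labels
```

Since the strong modules form a hierarchy, the tree itself is obtained by ordering the labeled sets by inclusion.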

We give here two simple lemmata for further reference. 

Lemma 3.1 Let $M$ be a module of a graph $G = (V,E)$ and $M' \subseteq M$. Then $M'$ is a module of $G[M]$ if and
only if $M'$ is a module of $G$.

Furthermore, suppose $M \in \mathrm{MD}(G)$ is a strong module of $G$. Then $M'$ is a strong module of $G[M]$ if and
only if $M'$ is a strong module of $G$.

Proof: Let $M \in \mathbb{M}(G)$. If $M'$ is a module of $G[M]$, then all $x,y \in M'$ have the same
out$_{M'}$-neighbors in $G[M]$. Since $M$ is a module of $G$ and $M' \subseteq M$, all $x,y \in M'$ also have the
same neighbors in $V \setminus M$. Thus, all $x,y \in M'$ have the same out$_{M'}$-neighborhood in $G$.

If $M' \subseteq M$ is a module in $G$ then, in particular, the out$_{M'}$-neighborhood in $G[M]$ must be identical for
all $x,y \in M'$, and thus $M'$ is a module in $G[M]$.

Let $M \in \mathrm{MD}(G)$ and assume that $M'$ is a strong module of $G[M]$. Since $M$ is a strong module in $G$
it does not overlap any other module in $G$. Assume for contradiction that $M'$ is not a strong module of
$G$. Hence $M'$ must overlap some module $M''$ in $G$. This module $M''$ cannot be entirely contained in $M$ as
otherwise, $M''$ and $M'$ overlap in $G[M]$ implying that $M'$ is not a strong module of $G[M]$, a contradiction.
But then $M$ and $M''$ must overlap, since $M''$ intersects $M' \subseteq M$ and does not contain $M'$,
contradicting that $M \in \mathrm{MD}(G)$.

If $M'$ is a strong module of $G$ then it does not overlap any module of $G$. As every module of $G[M]$ is also
a module of $G$ by the first part of the lemma, it follows that $M'$ does not overlap any module of $G[M]$ and thus,
$M'$ must be a strong module of $G[M]$. □

Lemma 3.2 Let $G$ be an arbitrary graph and $G'$ be a cograph on the same vertex set $V$ so that
$\mathbb{M}(G) \subseteq \mathbb{M}(G')$, i.e., every module of $G$ is a module of $G'$. Moreover, let
$\mathbb{P}_{\max} := \mathbb{P}_{\max}(G)$ be the maximal modular partition of $G$. Then $\mathbb{P}_{\max}$ is a
modular partition of $G'$ and $G'/\mathbb{P}_{\max}$ is a cograph.

Proof: Since $\mathbb{M}(G) \subseteq \mathbb{M}(G')$ we can immediately conclude that $\mathbb{P}_{\max}$ is a (not
necessarily maximal) modular partition of $G'$ and therefore the quotient $G'/\mathbb{P}_{\max}$ is well-defined.
Assume, for contradiction, that $G'/\mathbb{P}_{\max}$ is not a cograph. Then $G'/\mathbb{P}_{\max}$ must contain an
induced $P_4$, say $M_1 - M_2 - M_3 - M_4$. As $M_1,\dots,M_4$ are modules of $G'$ and since two disjoint modules are
either adjacent or non-adjacent it follows that $G'$ must contain an induced $P_4$ of the form
$x_1 - x_2 - x_3 - x_4$ with $x_i \in M_i$, $1 \le i \le 4$, a contradiction. □


3.3 The Twin-Relation 

A special kind of module that will play a central role in this contribution are twins. Two vertices $x,y \in V$
are called twins if $\{x,y\}$ is a module of $G$. Twins $x,y \in V$ are called true twins if $xy \in E$ and false twins
otherwise. Twins $x$ and $y$ therefore satisfy $N(x) \setminus \{y\} = N(y) \setminus \{x\}$. In particular, for true
twins $\{x,y\}$ we can infer that $N[x] = N[y]$ and for false twins $\{x,y\}$ we have $N(x) = N(y)$.

Definition 1 Let $G = (V,E)$ be an arbitrary graph. The twin relation $\mathcal{T}$ is the binary relation on $V$ that
contains all pairs of twins: $(x,y) \in \mathcal{T}$ if and only if $x,y$ are twins in $G$. The pair
$(x,x) \in \mathcal{T}$ is called trivial twin.

Unless explicitly stated, we will use the phrase “a pair of twins” or “twins”, for short, to refer only to 
non-trivial twins. 

Proposition 3.3 Let $G = (V,E)$ be a given graph and $\mathcal{T}$ the twin relation on $V$. Then the following
statements hold:

1. The relation $\mathcal{T}$ is an equivalence relation.

2. For every equivalence class $M$ of $\mathcal{T}$ the distinct elements of $M$ are either all true or all false twins.

3. Every equivalence class $M$ of $\mathcal{T}$ is a module of $G$ and there is no other non-trivial strong module
contained in $M$.

Proof: We first prove Statements 1 and 2. Clearly, $\mathcal{T}$ is reflexive and symmetric. It remains to show that
$\mathcal{T}$ is transitive. Assume that $(x,y), (x,z) \in \mathcal{T}$. We show first that $x,y$ and $x,z$ can only be
either both true or both false twins. Assume for contradiction that $x,y$ are false twins and $x,z$ are true twins.
Hence, $(x,z) \in E(G)$ and since $N(x) = N(y)$ we have $(y,z) \in E(G)$. However, since $y \in N[z] = N[x]$ we have
$y \in N(x)$ and hence $(x,y) \in E(G)$, a contradiction.

Let $(x,y), (x,z) \in \mathcal{T}$ be both false twins. Thus, $N(z) = N(x) = N(y)$, which implies that
$(y,z) \in \mathcal{T}$. Moreover, since $N(z) = N(y)$, the vertices $y$ and $z$ cannot be adjacent and thus, $y$ and
$z$ are false twins. Now assume that $(x,y), (x,z) \in \mathcal{T}$ are both true twins. Hence,
$N[z] = N[x] = N[y]$ and, thus, $(y,z) \in E(G)$. Therefore, $(y,z) \in \mathcal{T}$ are true twins.

Thus, $\mathcal{T}$ is transitive and hence an equivalence relation, where each equivalence class $M$ of $\mathcal{T}$
comprises either only false or only true twins.

Now, we prove Statement 3. Statements 1 and 2 imply that an equivalence class $M$ of $\mathcal{T}$ contains either only
true or only false twins. If $M$ contains only false twins, then
$N(x) \setminus M = N(x) = N(y) = N(y) \setminus M$ for all $x,y \in M$ and thus, $M$ is a module. If $M$ contains only
true twins $x,y$, then $N(x) \setminus \{y\} = N(y) \setminus \{x\}$. If there is an additional vertex $z \in M$ we must
have $N(x) \setminus \{y,z\} = N(y) \setminus \{x,z\}$. Induction on the number of elements of $M$ shows
that $N(x) \setminus M = N(y) \setminus M$ holds for all $x,y \in M$. Therefore, $M$ is a module.

Finally, assume there is a non-trivial strong module $M'$ properly contained in $M$. Let $x \in M' \subsetneq M$ and
$z \in M \setminus M'$. Since $x,z \in M$, they are twins. Thus, $\{x,z\}$ is a module. Hence, $M'$ cannot be a strong
module since $\{x,z\} \cap M' = \{x\}$ and thus, $\{x,z\}$ and $M'$ overlap.

□

Note that equivalence classes of $\mathcal{T}$ are not necessarily strong modules, as the following example shows.
Consider the graph $G = (\{0,1,2,3\}, \{(2,3)\})$. The twin relation $\mathcal{T}$ on $G$ has equivalence classes
$\{0,1\}$ and $\{2,3\}$. However, $M = \{1,2,3\}$ is also a module, as $N(i) \setminus M = \emptyset$,
$1 \le i \le 3$. In this case, $M$ overlaps $\{0,1\}$. The modular decomposition $\mathrm{MD}(G)$ is
$\{\{0\},\{1\},\{2\},\{3\},\{2,3\},\{0,1,2,3\}\}$, while
$\mathbb{M}(G) \setminus \mathrm{MD}(G) = \{\{0,1\},\{0,2,3\},\{1,2,3\}\}$.
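
The example can be checked mechanically. The sketch below groups vertices into equivalence classes of the twin relation, relying on the transitivity shown in Proposition 3.3 so that comparing against one representative per class suffices. The adjacency-dictionary input and the names `twin_classes`, `is_module` and `overlap` are assumptions made for this illustration.

```python
def twin_classes(adj):
    """Equivalence classes of the twin relation (Proposition 3.3)."""
    def twins(x, y):
        return adj[x] - {y} == adj[y] - {x}

    classes = []
    for v in adj:
        for c in classes:
            # Transitivity: testing one representative is enough.
            if twins(v, next(iter(c))):
                c.add(v)
                break
        else:
            classes.append({v})
    return classes


def is_module(M, adj):
    M = set(M)
    return len({frozenset(adj[v] - M) for v in M}) <= 1


def overlap(A, B):
    A, B = set(A), set(B)
    return bool(A & B) and not A <= B and not B <= A


# The graph of the example: vertices 0..3 and the single edge {2,3}.
G = {0: set(), 1: set(), 2: {3}, 3: {2}}
classes = twin_classes(G)
```

Here `classes` consists of {0,1} and {2,3}, while `is_module({1, 2, 3}, G)` and `overlap({0, 1}, {1, 2, 3})` both hold, so the class {0,1} is a module that is not strong.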

3.4 $P_4$-sparse Graphs and Spiders

Although the cograph-editing problem is NP-complete, it can be solved in polynomial time for so-called
$P_4$-sparse graphs [LWGC11, LWGC12], in which every set of five vertices induces at most one
$P_4$ [Hoa85]. The efficient recognition of $P_4$-sparse graphs is intimately connected to so-called spider
graphs, a very peculiar class of prime graphs.

Lemma 3.4 ([JO89, JO92]) A graph $G$ is $P_4$-sparse if and only if exactly one of the following three
alternatives is true for every induced subgraph $H$ of $G$: (i) $H$ is not connected, (ii) $\overline{H}$ is not
connected, or (iii) $H$ is a spider.

Spiders come in two sub-types, called thin and thick [JO92, NG12]. A graph $G$ is a thin spider if its
vertex set can be partitioned into three sets $K$, $S$, and $R$ so that (i) $K$ is a clique; (ii) $S$ is a stable set;
(iii) $|K| = |S| \ge 2$; (iv) every vertex in $R$ is adjacent to all vertices of $K$ and none of the vertices of $S$;
and (v) each vertex in $K$ is connected to exactly one vertex in $S$ by an edge and vice versa. A graph $G$ is a thick
spider if its complement $\overline{G}$ is a thin spider. The sets $K$, $S$, and $R$ are usually referred to as the body,
the set of legs, and head, resp., of a thin spider. The path $P_4$ is the only graph that is both a thin and thick
spider. Interestingly, spider graphs are fully characterized by their degree sequences [BCF+15].

Lemma 3.4 in particular implies that any strong module $M \in \mathrm{MD}(G)$ of a $P_4$-sparse graph is either (i)
parallel, (ii) series, or (iii) prime, in which case it corresponds to a spider $G[M]$. In general, we might
therefore additionally distinguish prime modules $M$ as those where $G[M]$ is a spider (called spider modules)
and those where $G[M]$ is not a spider (which we still call simply prime modules).


3.5 Useful Properties of Modular Partitions 

First, we briefly summarize the relationship between cographs G and the modular decomposition MD(G). 


Theorem 3.5 ([CLSB81, BLS99]) Let $G = (V,E)$ be an arbitrary graph. Then the following statements
are equivalent.

1. $G$ is a cograph.

2. $G$ does not contain induced paths on four vertices $P_4$.

3. $\mathrm{MDT}(G)$ is the cotree of $G$ and hence, has no inner vertices labeled with “prime”.

4. Any non-trivial induced subgraph of $G$ has at least one pair of twins $\{x,y\}$.


For later explicit reference, we summarize in the next theorem several results that we already implicitly 
referred to in the discussion above. 


Theorem 3.6 ([GPP10, HP10, Moh85]) The following statements are true for an arbitrary graph
$G = (V,E)$:

(T1) The maximal modular partition $\mathbb{P}_{\max}(G)$ and the modular decomposition $\mathrm{MD}(G)$ of $G$ are
unique.

(T2) Let $\mathbb{P}$ be a modular partition of $G$. Then $\mathbb{P}' \subseteq \mathbb{P}$ is a (non-trivial strong)
module of $G/\mathbb{P}$ if and only if $\bigcup_{M \in \mathbb{P}'} M$ is a (non-trivial strong) module of $G$.

(T3) Let $M$ be a module of $G$ and $\{a,b,c,d\}$ be four vertices inducing a $P_4$ in $G$, then
$|M \cap \{a,b,c,d\}| \le 1$ or $\{a,b,c,d\} \subseteq M$.

(T4) For any connected graph $G$ with $\overline{G}$ being connected, the quotient $G/\mathbb{P}_{\max}(G)$ is a prime
graph.

(T5) Let $\mathbb{P}_{\max}$ be the maximal modular partition of $G[M]$, where $M$ denotes a prime module of $G$ and
$\mathbb{P}' \subsetneq \mathbb{P}_{\max}$ be a proper subset of $\mathbb{P}_{\max}$ with $|\mathbb{P}'| > 1$. Then,
$\bigcup_{M' \in \mathbb{P}'} M' \notin \mathbb{M}(G)$.


Statements (T1) and (T4) are clear. Statement (T2) characterizes the (non-trivial strong) modules of $G$
in terms of (non-trivial strong) modules $\mathbb{P}'$ of $G/\mathbb{P}$. Statement (T3) clarifies that each induced
$P_4$ is either entirely contained in a module or intersects a module in at most one vertex. Statement (T5) explains
that no union of a proper subset of at least two elements of the maximal modular partition of $G[M]$ is a module of
$G$. Hence, only the prime module $M$ itself and, by Lemma 3.1, the elements $M' \in \mathbb{P}_{\max}$ are modules
of $G$.

4 Cograph Editing

4.1 Optimal Module-Preserving Edit Sets

Given an arbitrary graph we are interested in the following optimization problem. 

Problem 4.1 (Cograph Editing) Given a graph $G = (V,E)$, find a set $F \subseteq \binom{V}{2}$ of minimum cardinality
s.t. $G^* = (V, E \triangle F)$ is a cograph.

We will simply call an edit set of minimum cardinality an optimal edit set.
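
To make Problem 4.1 concrete, the following sketch solves it by exhaustive search, trying candidate edit sets in order of increasing size. This is not the algorithm developed in this paper and is feasible only for a handful of vertices, since there are $2^{\binom{|V|}{2}}$ candidate sets; the function names are assumptions made for this illustration.

```python
from itertools import combinations, permutations


def is_cograph(vertices, E):
    """Brute-force P4 test on an edge set E of frozenset pairs."""
    def adj(x, y):
        return frozenset((x, y)) in E

    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            if (adj(a, b) and adj(b, c) and adj(c, d)
                    and not (adj(a, c) or adj(a, d) or adj(b, d))):
                return False
    return True


def optimal_edit_set(vertices, edges):
    """Return one minimum-cardinality F such that E (sym. diff.) F is a cograph."""
    vertices = list(vertices)
    E = {frozenset(e) for e in edges}
    pairs = [frozenset(p) for p in combinations(vertices, 2)]
    for k in range(len(pairs) + 1):          # smallest sets first
        for F in combinations(pairs, k):
            if is_cograph(vertices, E ^ set(F)):
                return set(F)
```

For a path on four vertices a single edit suffices, while the cycle on five vertices needs two. As noted below, optimal edit sets are in general not unique, so the function returns just one of them.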

The (decision version of the) cograph-editing problem is NP-complete [LWGC11, LWGC12]. Nevertheless,
the cograph-editing problem is fixed-parameter tractable (FPT) [PDdSS09]. Hence, for the
parametrized version of this problem, i.e., for a given graph $G = (V,E)$ and a parameter $k \ge 0$ find a set $F$
of at most $k$ edges and non-edges so that $G^* = (V, E \triangle F)$ is a cograph, there is an algorithm with running
time $O(6^k)$ [Cai96]. This FPT approach was improved in [LWGC11, LWGC12] to an $O(4.612^k + |V|^{4.5})$
time algorithm. These results are of little use for practical applications, because the parameter $k$ can become
quite large. However, they provide deep insights into the structure of the class of $P_4$-sparse graphs that
slightly generalizes cographs.

In particular, the cograph-editing problem can be solved in polynomial time whenever the input graph is
$P_4$-sparse [LWGC11, LWGC12]. The key observation is that every strong prime module $M$ of a $P_4$-sparse
graph $G$ is a spider module. The authors then proceed to show that it suffices to edit a fixed number of
(non)legs, i.e., only (non)edges $xy$ with $x \in K$ and $y \in S$ in $G[M]$, for all such spider modules to eventually
obtain an optimally edited cograph. The resulting algorithm to optimally edit a $P_4$-sparse graph to a
cograph, EDP4, runs in $O(|V| + |E|)$ time.

In the following we will frequently make use of a result by Guillemot et al. [GPP10] that is based on the
following lemma.

Lemma 4.2 ([GPP10]) Let $G = (V,E)$ be an arbitrary graph and let $M$ be a non-trivial module of $G$. If
$F_M$ is an optimal edge-edition set of the induced subgraph $G[M]$ and $F_{\mathrm{opt}}$ is an optimal edge-edition
set of $G$, then (i) $F = (F_{\mathrm{opt}} \setminus F_{\mathrm{opt}}[M]) \cup F_M$ is an optimal edge-edition set of
$G$ and (ii) $F_{\mathrm{opt}}[M]$ contains all (non-)edges $xy \in F_{\mathrm{opt}}$ with $x,y \in M$.


Proposition 4.3 ([GPP10]) Every graph $G = (V,E)$ has an optimal edit set $F_{\mathrm{opt}}$ such that every module
$M$ of $G$ is a module of the cograph $G_{\mathrm{opt}} = (V, E \triangle F_{\mathrm{opt}})$.


An edit set as described in Prop. 4.3 is called module-preserving. The importance of such edit sets lies in the
fact that they update either all or none of the edges between any two disjoint modules.

In the following remark we collect a few simple consequences of our considerations so far.


Remark 4.4 By Theorem 3.6 (T3) and the definition of cographs, all induced $P_4$’s of a graph are entirely
contained in the prime modules. By Lemma 3.1, the maximal modular partition $\mathbb{P}_{\max}$ of $G[M]$ is a subset
of the strong modules $\mathrm{MD}(G) \subseteq \mathbb{M}(G)$, for all strong modules $M \in \mathrm{MD}(G)$. Taken
together with Lemma 4.2, Proposition 4.3 and Theorem 3.6 (T1), this implies that it suffices to solve the
cograph-editing problem for $G$ independently on each of $G$’s strong prime modules [GPP10].

An optimal module-preserving edit set $F_{\mathrm{opt}}$ on $G$ therefore induces optimal edit sets $F_M$ on $G[M]$ for
any $M$, and thus also optimal edit sets $F_{\mathrm{opt}}(M,\mathbb{P}_{\max})$ on $G[M]/\mathbb{P}_{\max}$, where
$\mathbb{P}_{\max} = \{M_1,\dots,M_k\}$ is again the maximal modular partition of $G[M]$ and $M \in \mathbb{M}(G)$ is a
module of $G$. The edit set has the form

$$F_{\mathrm{opt}}(M,\mathbb{P}_{\max}) := \{\{M_i,M_j\} \mid M_i,M_j \in \mathbb{P}_{\max},\ \exists\, x \in M_i,\, y \in M_j \text{ with } \{x,y\} \in F_{\mathrm{opt}}[M]\}.$$


4.2 Optimal Module Merge Deletes All $P_4$’s

Since cographs are characterized by the absence of induced $P_4$’s, we can interpret every cograph-editing
method as the removal of all $P_4$’s in the input graph with a minimum number of edits. A natural strategy is
therefore to detect $P_4$’s and then to decide which ones must be edited. Optimal edit sets are not necessarily
unique. A further difficulty is that editing an edge of a $P_4$ can produce new $P_4$’s in the updated graph.
Hence we cannot expect a priori that local properties of $G$ alone will allow us to identify optimal edits.

By Remark 4.4, on the other hand, it is sufficient to edit within the prime modules. We therefore focus
on the maximal modular partition $\mathbb{P}_{\max} = \mathbb{P}_{\max}(G[M])$ of $G[M]$, where
$M \in \mathrm{MD}(G)$ is a strong prime module of $G$. Since $G[M]/\mathbb{P}_{\max}$ is prime, it does not contain
any twins. Now suppose we have edited $G$ to a cograph $G_{\mathrm{opt}}$ using an optimal module-preserving edit set.
Then $G_{\mathrm{opt}}[M]$ is a cograph and by Lemma 3.2 the quotient $G_{\mathrm{opt}}[M]/\mathbb{P}_{\max}$ is also
a cograph. Therefore, $G_{\mathrm{opt}}[M]/\mathbb{P}_{\max}$ contains at least one pair of twins $\{M_i,M_j\}$, where
$M_i$ and $M_j$ are, by construction, children of the prime module $M \in \mathrm{MD}(G)$.

This consideration suggests that it might suffice to edit the out$_{M_i}$- and out$_{M_j}$-neighborhoods in $G$ in
such a way that $M_i$ and $M_j$ become twins in an optimally edited cograph $G_{\mathrm{opt}}$. In the following we
will show that this is indeed the case.

We first show that twins are “safe”, i.e., that we never have to edit edges within a subgraph $G[M]$
induced by an equivalence class $M$ of the twin relation $\mathcal{T}$.

Lemma 4.5 Let $G = (V,E)$ be a non-cograph, $F$ be an arbitrary cograph edit set s.t. $G' = (V, E \triangle F)$ is
the resulting cograph and suppose that $G' \triangle e$ is a non-cograph for all $e \in F$. Then $\{x,y\} \notin F$ for
twins $x,y$ in $G'$.

Proof: Since $G'$ is a cograph it contains at least one pair of twins $x,y$. First assume that $x$ and $y$ are false
twins and thus, $xy \notin E(G')$. Assume, for contradiction, that $xy \in F$. By assumption, $G' + xy$ is not a
cograph and thus there is an induced $P_4$ containing the edge $xy$ in $G' + xy$. All such $P_4$’s that contain the
edge $xy$ are (up to symmetries and isomorphism) of the form (i) $a - x - y - b$ or (ii) $x - y - b - a$. Since
$x$ and $y$ are false twins in $G'$, we have $N_{G'}(x) = N_{G'}(y)$ and hence, there must be an edge
$xb \in E(G')$ and therefore, $xb \in E(G' + xy)$. Thus neither (i) nor (ii) is an induced path, which implies that
$G' + xy$ is still a cograph, the desired contradiction.

Now suppose that $x$ and $y$ are true twins, i.e., $xy \in E(G')$. Assume, for contradiction, that $xy \in F$ and
thus $G' - xy$ is not a cograph. Hence there must be an induced $P_4$ containing $x$ and $y$ in $G' - xy$. All
such $P_4$’s containing $x$ and $y$ are (up to symmetries and isomorphism) of the form (i) $x - a - y - b$ or
(ii) $x - a - b - y$. Since $x$ and $y$ are true twins in $G'$, we have
$N_{G'}(x) \setminus \{y\} = N_{G'}(y) \setminus \{x\}$. This implies that there is the edge $xb \in E(G' - xy)$ in
both case (i) and (ii). Therefore $G' - xy$ is a cograph, a contradiction.

□


Theorem 4.6 Let $G = (V,E)$ be an arbitrary graph, $F_{\mathrm{opt}}$ be an optimal cograph edit set for $G$,
$G_{\mathrm{opt}} = (V, E \triangle F_{\mathrm{opt}})$ the resulting cograph and $\mathcal{T}$ be the twin relation on
$G_{\mathrm{opt}}$. Then for each equivalence class $M$ of $\mathcal{T}$ it holds that:

(i) $M$ is a module of $G_{\mathrm{opt}}$ that does not contain any other non-trivial strong module of
$\mathbb{M}(G_{\mathrm{opt}})$.

(ii) $G_{\mathrm{opt}}[M]$ induces either an independent set $\overline{K_{|M|}}$ or a complete graph $K_{|M|}$.

(iii) For all $x,y \in M$ it holds that $\{x,y\} \notin F_{\mathrm{opt}}$ and thus, $G[M] \simeq G_{\mathrm{opt}}[M]$.


Proof: Statements (i) and (ii) are immediate consequences of Proposition 3.3. If $e \in F_{\mathrm{opt}}$ then
$G_{\mathrm{opt}} \triangle e$ is a non-cograph; otherwise $F_{\mathrm{opt}} \setminus \{e\}$ would be an edit set with
smaller cardinality, contradicting the optimality of $F_{\mathrm{opt}}$. Thus we can apply Lemma 4.5 to infer
statement (iii).

□


Corollary 4.7 Let $G = (V,E)$ be a prime graph, $F_{\mathrm{opt}}$ be an optimal cograph edit set,
$G_{\mathrm{opt}} = (V, E \triangle F_{\mathrm{opt}})$ the resulting cograph and $\mathcal{T}$ be the twin relation on
$G_{\mathrm{opt}}$. Then $V$ is not an equivalence class of $\mathcal{T}$.

Proof: As $G = (V,E)$ is a prime graph we know that $F_{\mathrm{opt}} \neq \emptyset$. If $V$ were an equivalence class
of $\mathcal{T}$, then Theorem 4.6 would imply that $\{x,y\} \notin F_{\mathrm{opt}}$ for all $x,y \in V$ and hence,
$F_{\mathrm{opt}} = \emptyset$, a contradiction. □

The following definitions formalize the “module merge process” that we will
use extensively in our approach.

Definition 2 (Module Merge) Let $G$ and $H$ be arbitrary graphs on the same vertex set $V$ with their
corresponding sets of all modules $\mathbb{M}(G)$ and $\mathbb{M}(H)$, resp. We say that a subset
$\mathbb{M}' = \{M_1,\dots,M_k\} \subseteq \mathbb{M}(G)$ of modules is merged (w.r.t. $H$) - or, equivalently, the
modules in $\mathbb{M}'$ are merged (w.r.t. $H$) - if (a) each of the modules in $\mathbb{M}'$ is a module of $H$, and
(b) the union of all modules in $\mathbb{M}'$ is a module of $H$ but not of $G$. More formally, the modules in
$\mathbb{M}'$ are merged (w.r.t. $H$), if

(i) $M_1,\dots,M_k \in \mathbb{M}(H)$,

(ii) $M = \bigcup_{i=1}^{k} M_i \in \mathbb{M}(H)$, and

(iii) $M \notin \mathbb{M}(G)$.

If $\{M_1,\dots,M_k\} \subseteq \mathbb{M}(G)$ is merged to a new module $M \in \mathbb{M}(H)$ we will write this as
$M_1 \uplus \dots \uplus M_k = \uplus_{i=1}^{k} M_i \to M$.

When modules $M_1,\dots,M_k$ of $G$ are merged w.r.t. $H$ then all vertices in $M = \bigcup_{i=1}^{k} M_i$ must have
the same out$_M$-neighbors in $H$, while at least two vertices $x \in M_i$, $y \in M_j$, $1 \le i \neq j \le k$ must
have different out$_M$-neighbors in $G$.

Definition 3 (Module Merge Edit) Let $G = (V,E)$ be an arbitrary graph and $F$ be an arbitrary edit set
resulting in the graph $H = (V, E \triangle F)$. Assume that $M_1,\dots,M_k \in \mathbb{M}(G)$ are modules that have
been merged w.r.t. $H$ resulting in the module $M = \bigcup_{i=1}^{k} M_i \in \mathbb{M}(H)$. Then

$$F_H(\uplus_{i=1}^{k} M_i \to M) := \{\{x,v\} \in F \mid x \in M,\ v \notin M\} \tag{1}$$

The edit set $F_H(\uplus_{i=1}^{k} M_i \to M)$ comprises exactly those (non)edges of $F$ that have been edited so that
all vertices in $M$ have the same out$_M$-neighborhood in $H$. In particular, it contains only (non)edges of $F$ that
are not entirely contained in $G[M]$.
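
Definitions 2 and 3 can be checked mechanically on small instances. The sketch below is a minimal illustration, assuming graphs are stored as adjacency dictionaries and edit sets as lists of vertex pairs; the names `apply_edits`, `is_merged` and `merge_edit` are hypothetical.

```python
def is_module(M, adj):
    M = set(M)
    return len({frozenset(adj[v] - M) for v in M}) <= 1


def apply_edits(adj, F):
    """Return H = (V, E symmetric-difference F) as a new adjacency dict."""
    H = {v: set(nb) for v, nb in adj.items()}
    for x, y in F:
        if y in H[x]:
            H[x].discard(y)
            H[y].discard(x)
        else:
            H[x].add(y)
            H[y].add(x)
    return H


def is_merged(parts, adj_G, adj_H):
    """Conditions (i)-(iii) of Definition 2 for modules `parts` of G."""
    parts = [set(p) for p in parts]
    union = set().union(*parts)
    return (all(is_module(p, adj_G) for p in parts)      # parts are modules of G
            and all(is_module(p, adj_H) for p in parts)  # (i)
            and is_module(union, adj_H)                  # (ii)
            and not is_module(union, adj_G))             # (iii)


def merge_edit(parts, F):
    """Edits of F with exactly one endpoint in the merged module, cf. Eq. (1)."""
    union = set().union(*(set(p) for p in parts))
    return {frozenset(e) for e in F if len(set(e) & union) == 1}
```

As a small usage example, take the path 0 - 1 - 2 - 3 and the edit set that deletes the edge {2,3}. In the resulting graph the vertices 0 and 2 are false twins, so {0} and {2} are merged to {0,2}, and the merge edit consists of the single pair {2,3}.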

[Figure 1. Panels: $G[M]$ (upper left), $G_{\mathrm{opt}}[M]$ (upper right), the quotient $G[M]/\mathbb{P}_{\max}$, and the MDTs of $G[M]/\mathbb{P}_{\max}$ and $G_{\mathrm{opt}}[M]/\mathbb{P}_{\max}$ with leaves $M_1,\dots,M_5$.]

Fig. 1: Assume we are given a non-cograph $G$ that contains a strong prime module $M \in \mathrm{MD}(G)$ so that
$G[M]$ is the graph in the upper left part of this picture. Moreover, assume there is an optimal module-preserving
edit set $F_{\mathrm{opt}}$ transforming $G$ to a cograph $G_{\mathrm{opt}}$ so that
$\{3,5\}, \{4,5\} \in F_{\mathrm{opt}}$. Hence, $G_{\mathrm{opt}}[M]$ is the graph in the upper right part.
Let $\mathbb{P}_{\max} = \mathbb{P}_{\max}(G[M]) = \{M_1,\dots,M_5\}$ be the maximal modular partition of $G[M]$. The
modules $M_1,\dots,M_5$ are highlighted as gray parts in $G[M]$ and $G_{\mathrm{opt}}[M]$. Now,
$G[M]/\mathbb{P}_{\max}$ is prime and contains no twins, while $G_{\mathrm{opt}}[M]/\mathbb{P}_{\max}$ is a cograph and
therefore contains twins. The twin relation $\mathcal{T}$ on $G_{\mathrm{opt}}[M]/\mathbb{P}_{\max}$ has equivalence
classes $N = \{M_1,M_2\}$, $N' = \{M_4\}$ and $N'' = \{M_3,M_5\}$. Hence, the modules $M_1$ and $M_2$, as well as
$M_3$ and $M_5$, are merged w.r.t. $G_{\mathrm{opt}}[M]/\mathbb{P}_{\max}$. Therefore, the modules
$N_V = M_1 \cup M_2 = \{1,2,3,4\}$ and $N''_V = M_3 \cup M_5 = \{5,8\}$ have been obtained by merging modules of $G[M]$
w.r.t. $G_{\mathrm{opt}}[M]$ and in particular, w.r.t. $G_{\mathrm{opt}}$. In symbols, $M_1 \uplus M_2 \to N_V$ and
$M_3 \uplus M_5 \to N''_V$.

Therefore, instead of focusing on algorithms that optimally edit induced $P_4$’s one can equivalently ask for optimal
edit sets that resolve such prime modules $M$, that is, one asks for the minimum number of edits that adjust the
neighborhoods of modules that are children of $M$ in $\mathrm{MDT}(G)$ so that these modules become twins until the
module $M$ becomes a non-prime module, see Theorem 4.10.
Lemma 4.8 Let $M$ be a strong prime module of a given graph $G = (V,E)$ and $\mathbb{P}_{\max}$ be the maximal modular partition of $G[M]$. Moreover, let $F$ be an arbitrary edit set resulting in the graph $H = (V, E \triangle F)$. Assume that $\mathcal{M} = \{M_1, \dots, M_k\} \subseteq \mathbb{P}_{\max} \subseteq \mathbb{M}(G)$ is a set of modules that are merged w.r.t. $H$.

Then for any distinct indices $a, b, c \in \{1, \dots, k\}$ it holds that $M_a \uplus M_b \to M_{ab}$, $M_b \uplus M_a \to M_{ba}$ and $M_{ab} = M_{ba}$ in $H$. Moreover, $M_a \uplus (M_b \uplus M_c) \to M_{a(bc)}$ and $(M_a \uplus M_b) \uplus M_c \to M_{(ab)c}$ satisfy $M_{a(bc)} = M_{(ab)c}$; thus, the merge operation is associative and commutative. The merging of any subset of $\mathcal{M}$ w.r.t. a graph $H$ therefore is well-defined and independent of the individual merging steps.

Furthermore,

$F_H(M_a \uplus M_b \uplus M_c \to M_{abc}) = F_H(M_a \uplus M_b \to M_{ab}) \cup \big(F_H(M_{ab} \uplus M_c \to M_{abc}) \setminus F_H(M_a \uplus M_b \to M_{ab})\big)$

with $M_a \uplus M_b \to M_{ab}$.

Proof: For commutativity, we show first that $M_a \uplus M_b \to M_{ab}$ for every pair of distinct $a, b \in \{1, \dots, k\}$. By definition of "$\uplus$" we have $M_a, M_b \in \mathbb{M}(H)$, and thus, Condition (i) of Def. 2 is satisfied. Moreover, Thm. 3.6 (T5) implies that $M_a \cup M_b \notin \mathbb{M}(G)$ and hence, Condition (iii) of Def. 2 is satisfied. For Condition (ii) we have to show that $M_a \cup M_b = M_{ab} \in \mathbb{M}(H)$. Assume for contradiction that $M_a \cup M_b \notin \mathbb{M}(H)$. Then there is a vertex $v \in V$ so that $v$ is in the out$_{M_a}$-neighborhood, but not in the out$_{M_b}$-neighborhood w.r.t. $H$, or vice versa. However, this remains true if we consider $\bigcup_{i=1}^{k} M_i$, which implies that $\bigcup_{i=1}^{k} M_i \notin \mathbb{M}(H)$, and hence $\mathcal{M} = \{M_1, \dots, M_k\}$ is not merged w.r.t. $H$, a contradiction. Finally, $M_a \uplus M_b \to M_{ab}$ implies that $M_{ab} = M_a \cup M_b = M_b \cup M_a = M_{ba}$, and thus $\uplus$ is commutative.

By similar arguments one shows that Conditions (i), (ii), and (iii) of Def. 2 are satisfied for $M_a \uplus M_b \uplus M_c$. Hence, since $M_a \uplus M_b \uplus M_c \to M_{abc}$ implies that $M_{abc} = M_a \cup M_b \cup M_c$, associativity of $\uplus$ follows again directly from the associativity of the set union.

To see that the last property for $F$ is satisfied, note that

$F_H(M_a \uplus M_b \uplus M_c \to M_{abc})$
$= \{\{x,v\} \in F \mid x \in M_{abc}, v \notin M_{abc}\}$
$= \{\{x,v\} \in F \mid x \in M_{ab}, v \notin M_{ab}\} \cup \{\{x,v\} \in F \mid x \in M_{abc} \setminus M_{ab}, v \notin M_{abc} \setminus M_{ab}\}$
$= F_H(M_a \uplus M_b \to M_{ab}) \cup \big(F_H(M_{ab} \uplus M_c \to M_{abc}) \setminus F_H(M_a \uplus M_b \to M_{ab})\big)$

□


Let $G$ be an arbitrary graph and $F_{opt}$ be a minimum cardinality set of edits that applied to $G$ result in the cograph $G_{opt}$. We will show that every module-preserving edit set $F_{opt}$ can be expressed completely by means of module merge edits. To this end, we will consider the strong prime modules $M \in MD(G)$ of the given graph $G$ (in particular certain submodules of $M$ that do not share the same out-neighborhood) and adjust their out-neighbors to obtain new modules as long as $M$ stays a prime module. This procedure is repeated for all prime modules of $G$, until no prime modules are left in $G$.

As mentioned above, if $G$ is not a cograph there must be a strong prime module $M$ in $G$. Let $\mathbb{P}_{\max}$ be the maximal modular partition of $G[M]$. Theorem 3.6 implies that $G[M]/\mathbb{P}_{\max}$ is prime and thus does not contain twins. However, if $F_{opt}$ is module preserving, then Lemma 3.2 implies that the graph $G_{opt}[M]/\mathbb{P}_{\max}$ is a cograph and thus contains non-trivial twins by Theorem 3.5.

Hence, we aim at finding particular submodules $M_1, M_2, \dots, M_k \subseteq M$ in $G$ that have to be merged so that they become twins in $G_{opt}[M]/\mathbb{P}_{\max}$. As the following theorem shows, repeated application of this procedure to all prime modules of $G$ will completely resolve those prime modules resulting in $G_{opt}$. An illustrative example is shown in Figure 1.
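The role of twins in the quotient graph can be illustrated on a toy instance. The sketch below assumes the common definition that two vertices $x, y$ are twins if $N(x) \setminus \{y\} = N(y) \setminus \{x\}$, and represents a quotient graph by an adjacency dictionary whose vertices stand for the children of $M$; the function names and the example graphs are illustrative assumptions, not the graph of Figure 1.

```python
def are_twins(adj, x, y):
    """x and y are twins if they agree on all neighbors other than themselves."""
    return adj[x] - {y} == adj[y] - {x}

def twin_classes(adj):
    """Group the vertices into equivalence classes of the twin relation."""
    classes = []
    for v in sorted(adj):
        for cls in classes:
            representative = next(iter(cls))
            if are_twins(adj, representative, v):
                cls.add(v)
                break
        else:
            classes.append({v})
    return classes

# A prime quotient graph (here simply a P4 on four children): no twins.
prime_quotient = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
# After deleting the middle edge the quotient is a cograph and has twins.
edited_quotient = {0: {1}, 1: {0}, 2: {3}, 3: {2}}

print(twin_classes(prime_quotient))
print(twin_classes(edited_quotient))
```

Every twin class with more than one element corresponds to a set of children of $M$ that has been merged.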

For the proof of the final Theorem 4.10 we first establish the following result.

Lemma 4.9 Let $G = (V,E)$ be a graph, $F_{opt}$ be an optimal module-preserving cograph edit set, and $G_{opt} = (V, E \triangle F_{opt})$ be the resulting cograph. Furthermore, let $M$ be an arbitrary strong prime module of $G$ that does not contain any other strong prime module, and $\mathbb{P}_{\max} := \mathbb{P}_{\max}(G[M]) = \{M_1, \dots, M_k\}$ be the maximal modular partition of $G[M]$. Moreover, let $F' = \{\{x,y\} \in F_{opt} \mid \exists \{M_i, M_j\} \in F'_{opt}(M, \mathbb{P}_{\max}) \text{ with } x \in M_i, y \in M_j\}$, where $F'_{opt}(M, \mathbb{P}_{\max})$ is the edit set implied by $F_{opt}$ on $G[M]/\mathbb{P}_{\max}$ defined in Remark 4.4. Then, every strong prime module of $H = (V, E \triangle F')$ is a strong prime module of $G$. Moreover, $\mathbb{P}_{\max}(H[M']) = \mathbb{P}_{\max}(G[M'])$ holds for each strong prime module $M'$ of $H$.


Proof: First note that there is no other module $M' \subsetneq M$ containing induced $P_4$'s because $M$ does not contain any other strong prime module $M' \subsetneq M$ and because by Theorem 3.6 (T5) the union $\bigcup_{M' \in \mathbb{P}'} M'$ is not a module of $G$ for any of the subsets $\mathbb{P}' \subsetneq \mathbb{P}_{\max}$. Together with Lemma 4.2 this implies that $F' = \{\{x,y\} \in F_{opt} \mid x, y \in M\}$. Thus, $H[M]$ is a cograph as otherwise, there would be induced $P_4$'s contained in $H[M]$ and there are no further edits in $F_{opt} \setminus F'$ to remove these $P_4$'s. Thus, such $P_4$'s would remain in $G_{opt}$, a contradiction.

Let $M'$ be an arbitrary strong prime module of $H$. Note, since $F'$ does not affect the out$_M$-neighborhood, $M$ is still a module of $H$. Since $M'$ is a strong module in $H$ we have, therefore, either $M' \subsetneq M$, $M \subseteq M'$ or $M \cap M' = \emptyset$ in $H$.

Assume that $M' \subsetneq M$. Since $M'$ is prime in $H$ it follows that $H[M']$ is a non-cograph, a contradiction, since $H[M']$ is an induced subgraph of the cograph $H[M]$. Hence, the case $M' \subsetneq M$ cannot occur. If $M \subseteq M'$ or $M \cap M' = \emptyset$ then the out$_{M'}$-neighborhood of any vertex contained in $M'$ has not been affected by $F'$, and thus, $M'$ is also a module of $G$.

It remains to show for the cases $M \subseteq M'$ or $M \cap M' = \emptyset$ that the module $M'$ of $G$ is also strong and prime. If $M \cap M' = \emptyset$, then $F'$ does not affect any vertex of $M'$ and hence, $M'$ must be a strong prime module of $G$ and, in particular, $\mathbb{P}_{\max}(H[M']) = \mathbb{P}_{\max}(G[M'])$.

Let $M \subseteq M'$ and assume for contradiction that $M'$ is not strong in $G$. Thus, $M'$ must overlap some other module $M''$ in $G$. However, since $M$ is a strong module of $G$, $M$ cannot overlap $M''$ and hence, $M \cap M'' = \emptyset$. However, as $F'$ does only affect the vertices within $M$ and, in particular, none of the vertices of $M''$, and since $M'$ and $M''$ overlap in $G$, they must also overlap in $H$, a contradiction. Thus, $M'$ is a strong module of $G$.

Furthermore, let again $M \subseteq M'$ and assume that $M'$ is not prime in $G$. Now, let $\mathbb{P}'_{\max}$ be the maximal modular partition of $G[M']$. Since $M$ and $M'$ are strong modules in $G$ and $M \subseteq M'$, we have either $M \in \mathbb{P}'_{\max}$ or $M \subsetneq M'' \in \mathbb{P}'_{\max}$ for some strong module $M'' \in \mathbb{P}'_{\max}$. Hence, all modules of $\mathbb{P}_{\max}$ are entirely contained in the modules of $\mathbb{P}'_{\max}$. In particular, this implies that we have not changed the out$_{M''}$-neighborhood for any $M'' \in \mathbb{P}'_{\max}$. Therefore, the maximal modular partition of $G[M']$ is also the maximal modular partition of $H[M']$, i.e., $\mathbb{P}_{\max}(H[M']) = \mathbb{P}_{\max}(G[M'])$. However, if $M'$ is not prime in $G$, then $G[M']/\mathbb{P}'_{\max}$ is either totally disconnected ($M'$ is parallel) or a complete graph ($M'$ is series), while $H[M']/\mathbb{P}'_{\max}$ is prime (hence, it contains only trivial modules). Clearly, if $G[M']/\mathbb{P}'_{\max}$ is totally disconnected or a complete graph, any subset $\mathbb{P}' \subseteq \mathbb{P}'_{\max}$ forms a module and thus, $G[M']/\mathbb{P}'_{\max}$ does not contain only trivial modules. In summary, $G[M']/\mathbb{P}'_{\max} \neq H[M']/\mathbb{P}'_{\max}$, a contradiction, since we have not changed the out$_{M''}$-neighborhood for any $M'' \in \mathbb{P}'_{\max}$. □

Theorem 4.10 Let $G = (V,E)$ be a graph, $F_{opt}$ be an optimal module-preserving cograph edit set, and $G_{opt} = (V, E \triangle F_{opt})$ be the resulting cograph. Denote by MDP($G$) $\subseteq$ MD($G$) the set of strong prime modules of $G$.

Furthermore, let $\mathcal{M}_i$ denote a set of modules that have been merged w.r.t. $G_{opt}$ resulting in the new module $\mathsf{M}_i = \bigcup_{M \in \mathcal{M}_i} M \in \mathbb{M}(G_{opt})$, where $\mathcal{M}_i$ is a subset of some $\mathbb{P}_{\max}(G[M'])$ for some $M' \in$ MDP($G$). In other words, each $\mathcal{M}_i$ contains only the children of strong prime modules of $G$.

Let $\mathcal{M}_1, \dots, \mathcal{M}_r$ be all these sets of modules of $G$ that have been merged w.r.t. $G_{opt}$ into respective modules $\mathsf{M}_1, \dots, \mathsf{M}_r$ and let $\mathfrak{M} = \{\mathsf{M}_1, \dots, \mathsf{M}_r\}$ denote the set of these resulting new modules in $G_{opt}$. Then

$F_{opt} = F_{\mathfrak{M}} := \bigcup_{i=1}^{r} F_{G_{opt}}(\uplus_{M \in \mathcal{M}_i} M \to \mathsf{M}_i)$



Proof: The proof follows an iterative process, starting with the input graph $G^0 = G$, and proceeds by stepwise editing within certain strong prime modules $M$ resulting in new graphs $G^1, G^2, \dots, G^n = G_{opt}$ for some integer $n$. In each step, which leads from a non-cograph $G^k$ to $G^{k+1}$, we operate on one strong prime module $M \in$ MD($G^k$) of $G^k$ that does not contain any other strong prime module. We write $\mathbb{P}^k_{\max} := \mathbb{P}_{\max}(G^k[M])$ for the maximal modular partition of $G^k[M]$. Theorem 3.6 (T4) implies that $G^k[M]/\mathbb{P}^k_{\max}$ is prime and thus does not contain twins. Lemma 3.2 implies that $G_{opt}[M]/\mathbb{P}^k_{\max}$ is a cograph and hence, by Theorem 3.5, contains twins.

We will show in PART 1 that each equivalence class $N$ of the twin relation $\mathfrak{T}$ on $G_{opt}[M]/\mathbb{P}^k_{\max}$ yields a set of vertices $N_V \subseteq V$, where each $N_V$ is a module of $G_{opt}$ but not of $G^k$. In particular, every such $N_V$ is obtained by merging modules of $G^k$ only, so that only $N_V$ is affected. Application of the merge edit operations contained in $F_{opt}$ to obtain these new modules $N_V$ results in the new graph $G^{k+1}$. The procedure is then repeated unless the new graph $G^{k+1}$ equals $G_{opt}$.

We then show in PART 2 that none of the new modules $N_V$ is a module of the starting graph $G$. Furthermore we will see that $N_V$ is obtained not only by merging modules of $G^k$ but also by merging modules of $G$. Finally we use these two results to show that $F_{opt} = F_{\mathfrak{M}}$.



PART 1:

We start with the base case, and show how to obtain the graph $G^1$ from $G^0 := G$. W.l.o.g., assume that $G^0$ is not a cograph and thus that it contains prime modules. Let $M \in$ MD($G^0$) be a strong prime module of $G^0$ that does not contain any other strong prime module.

By the preceding arguments, $G^0[M]/\mathbb{P}^0_{\max}$ is prime and thus does not contain twins, while $G_{opt}[M]/\mathbb{P}^0_{\max}$ is a cograph and contains twins. In particular, there must be vertices (representing strong modules of $G^0$) in $G^0[M]/\mathbb{P}^0_{\max}$ that become twins in $G_{opt}[M]/\mathbb{P}^0_{\max}$. Let $\mathfrak{T}$ be the twin relation of $G_{opt}[M]/\mathbb{P}^0_{\max}$ and denote, for a given equivalence class $N \in \mathfrak{T}$, by $N_V \subseteq V$ the set of vertices contained in the modules of $N$, i.e., $N_V := \bigcup_{M_j \in N} M_j$. By Theorem 4.6, each $N \in \mathfrak{T}$ is a module of $G_{opt}[M]/\mathbb{P}^0_{\max}$. A preceding lemma implies that $\mathbb{P}^0_{\max}$ is a modular partition of $G_{opt}$ and thus we can apply Theorem 3.6 (T2) to conclude that $N_V$ is a module of $G_{opt}[M]$. Since $F_{opt}$ is module preserving, it follows that $M$ is a module of $G_{opt}$ and thus, by Lemma 3.1, $N_V$ is a module of $G_{opt}$. A preceding corollary furthermore implies that $N_V \neq M$ for all $N \in \mathfrak{T}$. Moreover, by Theorem 3.6 (T5), none of the equivalence classes $N \in \mathfrak{T}$ with $|N| > 1$ are modules of $G^0[M]/\mathbb{P}^0_{\max}$. Hence, by Theorem 3.6 (T2), $N_V$ is not a module of $G^0[M]$ for all $N \in \mathfrak{T}$ with $|N| > 1$ and hence, Lemma 3.1 implies that $N_V$ is not a module of $G^0$ for all $N \in \mathfrak{T}$ with $|N| > 1$. Since each equivalence class $N \in \mathfrak{T}$ with $|N| > 1$ comprises modules of $G^0 = G$, and for the respective vertex set $N_V$ we have both $N_V \notin \mathbb{M}(G^0)$ and $N_V \in \mathbb{M}(G_{opt})$, the modules contained in $N$ are merged to $N_V$ w.r.t. $G_{opt}$, in symbols $\uplus_{M_i \in N} M_i \to N_V$ for all $N \in \mathfrak{T}$ with $|N| > 1$.

We write $\mathfrak{M}^0 := \{N_V \mid N_V = \bigcup_{M_i \in N} M_i, N \in \mathfrak{T}, |N| > 1\}$ for the set of all new modules $N_V$ that are the result of merging modules of $G^0[M]$. We continue to construct the graph $G^1$. To this end, we set $F^0 = F_{opt}$ and consider the subset $E^0 \subseteq F_{opt}$ of those edits that join vertices of distinct children of $M$. Since $G^0[M]/\mathbb{P}^0_{\max}$ does not contain any pair of twins, while $G_{opt}[M]/\mathbb{P}^0_{\max}$ does, the edit set $F'_{opt}(M, \mathbb{P}^0_{\max})$ implied by $F_{opt}$ on the quotient graph must be non-empty. Hence, $E^0 = \{\{x,y\} \in F^0 \mid x \in M_i, y \in M_j, \{M_i, M_j\} \in F'_{opt}(M, \mathbb{P}^0_{\max})\} \neq \emptyset$. Theorem 4.6 implies $\{x,y\} \notin E^0$ for all $x \in M_i$, $y \in M_j$, all $M_i, M_j \in N$ and all $N \in \mathfrak{T}$. Thus, $\{x,y\} \notin E^0$ for all $x, y \in N_V$. Since $E^0 \neq \emptyset$, it contains only pairs of vertices $\{x,z\}$ with $x \in N_V \subseteq M$ and $z \in M \setminus N_V$. In other words, $E^0$ comprises only pairs of vertices $\{x,y\} \subseteq M$ that have been edited to merge the modules of $G^0[M]$ resulting in the new modules $N_V \in \mathfrak{M}^0$. Note that there might be additional edits $\{x,y\} \in F_{opt} \setminus E^0$ for some $x \in N_V$ and $y \in V \setminus N_V$. Nevertheless we have $E^0 \subseteq \bigcup_{N_V \in \mathfrak{M}^0} F_{G_{opt}}(\uplus_{M_i \in N} M_i \to N_V) \subseteq F_{\mathfrak{M}}$.

Finally, set $G^1 = (V, E \triangle E^0)$. Thus, all $N_V \in \mathfrak{M}^0$ are modules of $G^1$. If $G^1 = G_{opt}$, then $E^0 = F_{opt}$. By construction $E^0 \subseteq F_{\mathfrak{M}} \subseteq F_{opt}$. Hence we can conclude that $F_{\mathfrak{M}} = F_{opt}$ and we are done.

Now we turn to the general editing step. If $G^k$, $k \ge 1$, is not a cograph, then we define the set $F^k = F_{opt} \setminus (\bigcup_{j=0}^{k-1} E^j)$. We can re-use exactly the same arguments as above. Starting with a strong prime module $M \in$ MD($G^k$) (that does not contain any other strong prime module), we show that the resulting set $E^k$ is not empty and comprises (non)edges in $E^k \subseteq F_{opt}$ so that $E^k \subseteq F_{\mathfrak{M}}$. In particular, we find that $E^k$ contains only the (non)edges that have been edited so that for all new modules $N_V \in \mathfrak{M}^k$ of the graph $G^{k+1} = (V, E \triangle (\bigcup_{j=0}^{k} E^j))$ it holds that $N_V \notin \mathbb{M}(G^k)$, $N_V \in \mathbb{M}(G_{opt})$ and that $N_V$ is the union of modules contained in $\mathbb{M}(G^k)$ that are, in particular, children of the chosen prime module in $G^k$.


PART 2:

It remains to show that every $N_V \in \mathfrak{M}^k$ has the following two properties:

(a) $N_V \notin \mathbb{M}(G)$; and

(b) $N_V$ is a module that has been obtained by merging modules that are children of prime modules of $G$ and not only by merging such modules of $G^k$.

We first prove the following

Claim 1: Any module of $G^0 = G$ is also a module of $G^k$.

Proof of Claim 1.

We proceed by induction. Let $k = 1$ and thus, $G^1 = (V, E \triangle E^0)$. Let $M^*$ be a module of $G^0 = G$. If $\{x,a\} \notin E^0$ for all vertices $x \in M^*$ then $M^*$ is a module in $G^1$ because the out$_{M^*}$-neighborhood in $G^1$ is not changed. Now assume that there is some vertex $x \in M^*$ with $\{x,a\} \in E^0$. If all such $a$ with $\{x,a\} \in E^0$ are also contained in $M^*$, then $M^*$ is still a module in $G^1$. Therefore, in what follows assume that $a \notin M^*$.

Let $M$ be the strong prime module that has been used to edit $G^0$ to $G^1$ in our construction above. Since for all $\{x,a\} \in E^0$ we have by construction $x, a \in M$, we can conclude that $M \cap M^* \neq \emptyset$. Since $M$ is a strong module we have $M \subseteq M^*$ or $M^* \subseteq M$. However, from $a \in M$ and $a \notin M^*$ it follows that $M^* \subsetneq M$. Write $\mathbb{P}^0_{\max}$ for the maximal modular partition of $G^0[M]$. Theorem 3.6 (T5) implies that for any non-trivial subset $\mathbb{P}'$ of $\mathbb{P}^0_{\max}$ the union of elements cannot be a module of $G^0$. In other words, $M^*$ is not the union of elements of any such subset $\mathbb{P}'$. Therefore, if $M^* \subsetneq M$, then $M^* = M_i$ for some module $M_i \in \mathbb{P}^0_{\max}$. If $M_i = \{x\}$, then $M_i$ trivially remains a module of $G^1$. Hence assume $|M_i| > 1$. By construction $M_i \subseteq N_V$ for some $N \in \mathfrak{T}$. There are two cases: (1) $|N| > 1$ and thus $M_i \subsetneq N_V$, and (2) $|N| = 1$, i.e., $N = \{M_i\}$, and thus $N_V = M_i$ is not merged.


Case (1) Assume first that the edge $\{x,a\} \in E^0$ with $x \in M_i$ and $a \notin M_i$ was added. Since $F_{opt}$ is module preserving, $N_V \in \mathbb{M}(G_{opt})$. Hence, by construction of $E^0$, we can conclude that all vertices $y \in N_V \subseteq M$ must be adjacent to $a$ in $G^1[M]$. Since $M_i \subseteq N_V$, all $y \in M_i$ must be adjacent to $a$ in $G^1[M]$. Analogously, if the edge $\{x,a\}$ has been removed, then all $y \in M_i$ must be non-adjacent to $a$. Hence, elements in $M_i$ have the same out$_{M_i}$-neighborhood in $G^1[M]$ and thus, $M_i$ is a module of $G^1[M]$. Since our construction does not change the out$_M$-neighborhood, i.e., $M$ is a module of $G^1$, Lemma 3.1 implies that $M_i$ is also a module of $G^1$.

Case (2) Recall first that $E^0$ comprises only pairs of vertices $\{u,v\} \subseteq M$ that have been edited to merge the modules of $G^0[M]$ resulting in the new modules in $\mathfrak{M}^0$. Since $N_V$ is not merged we have $N_V \notin \mathfrak{M}^0$ and thus, we cannot directly assume that $\{y,a\}$ with $y \in N_V = M_i$ are in $E^0$. However, as there was some edit $\{x,a\} \in E^0$ there must be some other merged module $N'_V \in \mathfrak{M}^0$ with $a \in N'_V$. Moreover, since $F_{opt}$ is module preserving it follows that $M_i$ is a module of $G_{opt}$. Hence, if $\{x,a\} \in E^0 \subseteq F_{opt}$ was added, then all pairs of vertices $\{y,a\}$ with $y \in M_i$ not adjacent to $a$ must be contained in $F_{opt}$ and thus, in particular, $a \in N'_V$, which implies $\{y,a\} \in E^0$. Analogous arguments apply if $\{x,a\}$ was removed. Therefore, either all vertices in $M_i$ are adjacent to $a$ or all of them are non-adjacent to $a$ in $G^1[M]$. Therefore $M_i$ is a module of $G^1[M]$. As in case (1) we can now argue that $M_i$ is also a module of $G^1$.


To summarize, all modules of $G^0 = G$ are modules of $G^1$. Assume the statement is true for $k$ and assume that $G^k$ is not a cograph (since there is nothing more to show if $G^k$ is a cograph). Applying the same arguments as above, we can infer that every module of $G^k$ is also a module of $G^{k+1}$. Since all modules of $G$ are by assumption also modules of $G^k$, they are also modules of $G^{k+1}$. $\diamond$


Proof of Statement (a).

By Claim 1, every module of $G$ is a module of $G^k$. Thus, if there is a subset $M \subseteq V$ that is not a module of $G^k$, then $M$ is not a module of $G$. Since we have already shown that all modules $N_V \in \mathfrak{M}^k$ of $G^{k+1}$ are not modules of $G^k$, we conclude that none of the modules $N_V \in \mathfrak{M}^k$ can be a module of $G$. This implies statement (a). $\diamond$


Proof of Statement (b).

By Lemma 4.9 and construction of $G^{k+1}$, we find that all strong prime modules of $G^{k+1}$ are also strong prime modules of $G^k$. Therefore, by induction, any strong prime module of $G^k$ is also a strong prime module of $G$.

It remains to show that the children of the chosen strong prime module $M$ in $G^k$ are also the children of $M$ in $G$. In other words, we must show that for the maximal modular partitions we have that $\mathbb{P}_{\max}(G[M]) = \mathbb{P}_{\max}(G^k[M])$ for each $k \ge 1$.

We again proceed by induction. By construction, if $M$ is the chosen strong prime module of $G^0$ that does not contain any other strong prime module in $G^0$, then only children of $M$ in $G^0$ are merged to obtain $G^1$. Therefore, let $M$ be the chosen strong prime module of $G^1$ that does not contain any other strong prime module in $G^1$ and that is used to obtain the graph $G^2$. Lemma 4.9 implies that $M$ is a strong prime module of $G$ and $\mathbb{P}_{\max}(G[M]) = \mathbb{P}_{\max}(G^0[M]) = \mathbb{P}_{\max}(G^1[M])$. Thus, only children of prime modules of $G$ have been merged to obtain the graph $G^2$. Assume the statement is true for $k-1$. By analogous arguments as in the step from $G^0$ to $G^1$, we can show that for the chosen strong prime module $M$ in $G^k$ that does not contain any other strong prime module of $G^k$ and that is used to obtain $G^{k+1}$, we have $\mathbb{P}_{\max}(G^k[M]) = \mathbb{P}_{\max}(G^{k-1}[M])$.

Thus, only children of prime modules of $G^{k-1}$ have been merged to obtain the graph $G^{k+1}$. However, since each strong prime module of $G^k$ is a strong prime module of $G$ and since by induction hypothesis $\mathbb{P}_{\max}(G^{k-1}[M]) = \mathbb{P}_{\max}(G[M])$ we obtain that $\mathbb{P}_{\max}(G^k[M]) = \mathbb{P}_{\max}(G[M])$, and thus only children of prime modules of $G$ have been merged to obtain the graph $G^{k+1}$. $\diamond$


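Finally, recall that $E^k \subseteq \{\{x,y\} \in F_{opt} \mid x \in N_V, y \in V \setminus N_V, N_V \in \mathfrak{M}^k\} = \bigcup_{N_V \in \mathfrak{M}^k} F_{G_{opt}}(\uplus_{M_i \in N} M_i \to N_V) \subseteq F_{\mathfrak{M}}$, where $\mathfrak{T}^k$ is the twin relation applied to $G_{opt}[M]/\mathbb{P}^k_{\max}$ of the chosen strong prime module $M$ in $G^k$ that does not contain any other strong prime module, and $N \in \mathfrak{T}^k$. By construction and because any module contained in $\mathfrak{M}' = \{N_V \in \mathfrak{M}^j \mid 0 \le j \le k\}$ is obtained by merging the children of strong prime modules of $G$ only, and since the children of strong prime modules are in particular modules of $G$, we have $\bigcup_{j=0}^{k} E^j \subseteq \bigcup_{j=0}^{k} \bigcup_{N_V \in \mathfrak{M}^j} F_{G_{opt}}(\uplus_{M \in N} M \to N_V) \subseteq F_{\mathfrak{M}}$, where $N$ comprises all modules that have been merged to obtain the respective module $N_V$ contained in $\mathfrak{M}^j$.

This iterative process may lead to the situation that $\mathfrak{M}' \subsetneq \mathfrak{M}$, where $\mathfrak{M}$ denotes the set of all modules of $G_{opt}$ that have been obtained by merging modules of $G$. However, since $E^k \subseteq F_{opt}$ and, in particular, $E^k \neq \emptyset$ if and only if $G^k$ is a non-cograph (and thus contains prime modules), and because $|F^{k+1}| < |F^k|$, the iteration necessarily terminates for some finite $n$ with $F^n = \emptyset$ and thus $G^n = (V, E \triangle (\bigcup_{j=0}^{n-1} E^j)) = G_{opt}$ and $\bigcup_{j=0}^{n-1} E^j = F_{opt}$. Since $\bigcup_{j=0}^{n-1} E^j \subseteq F_{\mathfrak{M}'} \subseteq F_{\mathfrak{M}} \subseteq F_{opt}$ we can finally conclude that $F_{\mathfrak{M}} = F_{opt}$. □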


Theorem 4.10 implies that every module-preserving optimal edit set $F_{opt}$ (which always exists) can be expressed as optimal module merge edits. Hence, instead of editing the induced $P_4$'s of a given graph $G$ directly, one can equivalently resolve the strong prime modules by merging their children until no further prime module is left. At first glance this result seems to be only of theoretical interest since the construction of optimal module merge edits is not easier than solving the editing problem.

The proof is constructive, however, and implies an alternative exact algorithm to solve the cograph editing problem, this time based on the stepwise resolution of the prime modules. Of course it has exponential runtime because in each step one needs to determine which of the modules have to be merged and which of the (non)edges have to be edited to obtain new modules. In particular, there are $2^n$ possible subsets for a prime module with $n$ children in MDT($G$) that all give rise to modules that can be merged to new ones. Moreover, for each subset of modules that will be merged, there are exponentially many possibilities in the number of vertices to add or remove edges.
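The exponential number of candidate subsets can be made concrete with a short enumeration. The sketch below is an illustration only: the function name is hypothetical, and it uses a slightly refined count that follows from the merge conditions, namely that singletons merge nothing and that the full set of children has union $M$, which is already a module of $G$ and therefore violates Condition (iii). This leaves $2^n - n - 2$ candidates, which is still exponential in $n$.

```python
from itertools import combinations

def candidate_merge_sets(children):
    """Yield all subsets of the children with at least two and fewer than all elements."""
    n = len(children)
    for size in range(2, n):
        yield from combinations(children, size)

# Children of a hypothetical prime module, named by labels only.
children = ["M1", "M2", "M3", "M4"]
candidates = list(candidate_merge_sets(children))
print(len(candidates))   # 2**4 - 4 - 2
print(candidates[:3])
```

For each such candidate an exact algorithm would additionally have to search over the possible (non)edge edits, which is the second source of exponential growth mentioned above.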
5 A modular-decomposition-based Heuristic for Cograph Editing

The practical virtue of our result is that it suggests an alternative strategy to construct heuristic algorithms for cograph editing. Our starting point is Lemma 4.8, which implies that we can construct $F_{opt}$ by pairwise merging of children of prime modules.

Assume there is a strong prime module $M$ and $\mathcal{M}$ denotes a subset of the children of $M$ w.r.t. MDT($G$) that have been merged w.r.t. $G_{opt}$. Now, instead of merging all modules at once, one can perform the merging process step by step. Take e.g. $M_1$ and $M_2$ of $\mathcal{M}$ and define $F_{12} \subseteq F_{opt}$ as the set of all edits that have been used to merge $M_1$ and $M_2$ so that they become twins in $G_{opt}[M]$. Then delete $M_1$ from $\mathcal{M}$, replace $M_2$ by $M_1 \cup M_2$ in $\mathcal{M}$, and replace $F_{opt}$ by $F_{opt} \setminus F_{12}$. Now all vertices in $M_1 \cup M_2$ have the same out$_{M_1 \cup M_2}$-neighbors in $G[M] \triangle F_{12}$. Repeating this procedure reduces the size of $\mathcal{M}$ by one element in each round and eventually terminates with an edit set $F(\mathcal{M}) \subseteq F_{opt}$ that collects the edits used in the individual rounds.
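The round-by-round merging can be sketched as follows. This is an illustration under stated assumptions: the edits attributed to one pairwise merge are taken to be those of Equation (1) restricted to edits inside the prime module $M$, i.e., the remaining edits with exactly one endpoint in $M_1 \cup M_2$; the names `merge_edit` and `stepwise_pairwise_merge` are hypothetical.

```python
def merge_edit(edits, module):
    """Edits with exactly one endpoint inside `module` (cf. Equation (1))."""
    return {e for e in edits if len(e & module) == 1}

def stepwise_pairwise_merge(modules, f_opt, parent):
    """Merge the given children of `parent` two at a time.

    modules: children of the prime module that are merged w.r.t. G_opt
    f_opt:   set of edits, each a two-element frozenset
    parent:  vertex set of the strong prime module M
    Returns the list of rounds (M1, M2, F12) and the final merged module.
    """
    remaining = [frozenset(m) for m in modules]
    edits_left = {e for e in f_opt if e <= parent}   # only edits inside M
    rounds = []
    while len(remaining) > 1:
        m1 = remaining.pop(0)
        m2 = remaining.pop(0)
        union = m1 | m2
        f12 = merge_edit(edits_left, union)   # edits used in this round
        edits_left -= f12                     # F_opt <- F_opt \ F12
        rounds.append((m1, m2, f12))
        remaining.insert(0, union)            # M2 <- M1 u M2, M1 removed
    return rounds, remaining[0]

# Example: path 0-1-2-3 with F_opt = {{1,2}}; the children {0} and {1} are merged.
parent = frozenset({0, 1, 2, 3})
f_opt = {frozenset({1, 2})}
rounds, merged = stepwise_pairwise_merge([{0}, {1}], f_opt, parent)
print(rounds, merged)
```

Each round removes one element from the list of modules, so the loop stops after at most $|\mathcal{M}| - 1$ rounds, and the union of the collected edit sets is a subset of $F_{opt}$ by construction.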

This strategy can be applied to all sets $\mathcal{M}^1 = \mathcal{M}, \mathcal{M}^2, \dots, \mathcal{M}^m$ that contain modules that are children of $M$ in MDT($G$) and that have been merged to some new module $\mathsf{M}^i = \bigcup_{M' \in \mathcal{M}^i} M'$, respectively. Hence, $\bigcup_{i=1}^{m} F(\mathcal{M}^i) \subseteq F_{opt}[M]$, where $F_{opt}[M]$ contains all (non-)edges $\{x,y\} \in F_{opt}$ with $x, y \in M$. Lemma 4.2 implies that $F_{opt}[M]$ is optimal in $G[M]$ and hence, the modules in $\mathcal{M}^1, \mathcal{M}^2, \dots, \mathcal{M}^m$ cannot be merged with fewer edits than in $F_{opt}[M]$. It follows that $\bigcup_{i=1}^{m} F(\mathcal{M}^i) = F_{opt}[M]$. Thus, there is always an optimal edit set $F_{opt}$ that can be expressed by means of successive pairwise module merge edits, as done above. Algorithm 1 summarizes this approach in the form of pseudocode.


Algorithm 1 Simple Cograph Editing Heuristic.

Two functions, get-module() and get-module-pair-edit(), influence the practical efficiency, see text for details of their specification.

INPUT: A graph $G = (V,E)$
Compute MD($G$)
while $M$ = get-module(MD($G$)) do
    if $G[M]$ is a spider then
        Optimally edit $G[M]$ to a cograph by application of EDP4 [LWGC11, LWGC12]
    else
        Let $M_1, \dots, M_k$ be the children of $M$ in MDT($G$), i.e., $\{M_1, \dots, M_k\} = \mathbb{P}_{\max}(G[M])$
        $(M_i, M_j, F_{ij})$ = get-module-pair-edit($\{M_1, \dots, M_k\}$)
        $G \leftarrow (V, E(G) \triangle F_{ij})$
    end if
end while
OUTPUT: The cograph $G$
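The control flow of Algorithm 1 can be sketched in a few lines once the two subroutines are supplied from outside. The skeleton below is a simplification under explicit assumptions: it omits the spider branch (EDP4), it passes the current graph rather than MD($G$) to the callbacks, and it expects `get_module` to return both the module and its children. The demo callbacks are stubs that are only valid for the four-vertex path and do not compute a modular decomposition.

```python
from itertools import combinations

def has_induced_p4(adj):
    """Brute-force test: some 4 vertices induce a path (degree sequence 1,1,2,2)."""
    for quad in combinations(adj, 4):
        q = set(quad)
        if sorted(len(adj[v] & q) for v in quad) == [1, 1, 2, 2]:
            return True
    return False

def toggle(adj, edits):
    """Apply an edit set in place: delete present edges, insert absent ones."""
    for e in edits:
        x, y = tuple(e)
        if y in adj[x]:
            adj[x].discard(y)
            adj[y].discard(x)
        else:
            adj[x].add(y)
            adj[y].add(x)

def edit_to_cograph(adj, get_module, get_module_pair_edit, max_rounds=10_000):
    """Main loop of the heuristic with injected subroutines (spider case omitted)."""
    g = {v: set(ns) for v, ns in adj.items()}
    applied = set()
    for _ in range(max_rounds):
        found = get_module(g)            # None signals that g is a cograph
        if found is None:
            return g, applied
        module, children = found
        _, _, f_ij = get_module_pair_edit(g, module, children)
        toggle(g, f_ij)                  # G <- (V, E(G) symmetric-difference F_ij)
        applied ^= set(f_ij)             # net set of edits applied so far
    raise RuntimeError("round limit reached without obtaining a cograph")

# Demo stubs, valid only for the path 0-1-2-3 (whole graph prime, singleton children).
def demo_get_module(g):
    if has_induced_p4(g):
        return set(g), [frozenset({v}) for v in g]
    return None

def demo_pair_edit(g, module, children):
    return frozenset({0}), frozenset({1}), {frozenset({1, 2})}

p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
result, edits = edit_to_cograph(p4, demo_get_module, demo_pair_edit)
print(result, edits)
```

A real implementation would replace the stubs by a modular decomposition routine and by a pair-selection strategy for the children of the chosen prime module.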


Algorithm 1 contains two points at which the choice of a particular module or a particular pair of modules affects performance and efficiency. First, the function get-module() returns a strong prime module that does not contain any other prime module and returns false if there is no such module, i.e., if $G$ is a cograph. Second, subroutine get-module-pair-edit() extracts from $\{M_1, \dots, M_k\}$ a pair of modules. Ideally, these should satisfy the following two conditions:

(i) $M_i$ and $M_j$ have a minimum number of edits so that the out$_{M_i \cup M_j}$-neighborhood in $G[M]$ becomes identical for all $x, y \in M_i \cup M_j$ among all pairs in $\mathbb{P}_{\max}(G[M])$, and

(ii) additionally maximizes the number of removed $P_4$'s in $G[M]$ after applying these edits.
Lemma 5.1 If get-module-pair-edit() is an "oracle" that always returns a correct pair $M_i$ and $M_j$ together with the respective edit set $F_{ij}$ that is used to merge them, and get-module() returns an arbitrary strong prime module that does not contain any other prime module, then Alg. 1 computes an optimally edited cograph $G_{opt}$ in $O(p \Lambda h(n)) \le O(n^2 h(n))$ time, where $p$ denotes the number of strong prime modules, $\Lambda = \max |\mathbb{P}_{\max}(G[M])|$ among all strong prime modules $M$ of $G$, and $h(n)$ is the cost for evaluating get-module-pair-edit().


Proof: The correctness of Algorithm 1 follows directly from Lemma 4.8 and Theorem 4.10.

The modular decomposition MD($G$) can be computed in linear time, see e.g. [CH94, DGM01, MS94, TCHP08]. Then, we have to resolve each of the $p$ modules and in each step in the worst case all modules have to be merged stepwise, resulting in an effort of $O(|\mathbb{P}_{\max}|)$ merging steps in each iteration. Since $p \le n$ and $\Lambda \le n$ we obtain $O(n^2 h(n))$ as an upper bound. □


In practice, the exact computation of the optimal editing pairs requires exponential effort. Practical heuristics for get-module-pair-edit(), however, can be implemented in polynomial time. A simple heuristic strategy to find those pairs can be established as follows: Mark all of the $O(\Lambda^2)$ pairs $(M_i, M_j)$ for which the set $L$ of distinct out$_{M_i}$- and out$_{M_j}$-neighbors that are not contained in $M_i$ and $M_j$ has minimum cardinality. Removing, resp., adding all edges $xy$ with $x \in M_i \cup M_j$, $y \in L$ would yield a new module $M_i \cup M_j$ in the updated graph. Among all those marked pairs take the pair for a final merge that additionally removes a maximum number of induced $P_4$'s in the course of adjusting the respective out-neighborhoods. This amounts to an efficient method for detecting induced $P_4$'s. A detailed numerical evaluation of heuristics for cograph editing will be discussed elsewhere.
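A minimal sketch of such a pair-selection heuristic is given below. It makes several assumptions that are not fixed by the text: $L$ is computed as the set of vertices outside $M_i \cup M_j$ that are adjacent to some but not all vertices of $M_i \cup M_j$; for each $y \in L$ the cheaper of "remove all" and "add all" edges is chosen, with ties resolved in favor of removal; and the secondary criterion of maximizing the number of removed $P_4$'s is omitted, so the first pair with minimum $|L|$ is returned. All function names are illustrative.

```python
from itertools import combinations

def distinguishing_set(adj, mi, mj):
    """Vertices outside Mi u Mj adjacent to some, but not all, vertices of Mi u Mj.

    Vertices outside the enclosing module M see all of M uniformly,
    so they never appear in this set.
    """
    union = mi | mj
    return {y for y in adj
            if y not in union and 0 < len(adj[y] & union) < len(union)}

def pair_edit(adj, mi, mj):
    """Edits that make Mi u Mj a module, choosing the cheaper option per vertex in L."""
    union = mi | mj
    edits = set()
    for y in distinguishing_set(adj, mi, mj):
        neighbors = adj[y] & union
        non_neighbors = union - neighbors
        # Remove the existing edges or add the missing ones, whichever is fewer.
        targets = neighbors if len(neighbors) <= len(non_neighbors) else non_neighbors
        edits |= {frozenset((x, y)) for x in targets}
    return edits

def get_module_pair_edit(adj, children):
    """Return (Mi, Mj, F_ij) for the first pair of children with minimum |L|."""
    best = None
    for mi, mj in combinations(children, 2):
        size = len(distinguishing_set(adj, mi, mj))
        if best is None or size < best[0]:
            best = (size, mi, mj)
    _, mi, mj = best
    return mi, mj, pair_edit(adj, mi, mj)

# Example: the path 0-1-2-3, whose children are the four singletons.
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
children = [frozenset({v}) for v in sorted(p4)]
print(get_module_pair_edit(p4, children))
```

On the path the heuristic selects $\{0\}$ and $\{1\}$ with $L = \{2\}$ and proposes to delete the edge $\{1,2\}$, which leaves two disjoint edges and hence a cograph.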